Detecting Stealth Web Pages That Use Click-Through Cloaking
Abstract
Search spam is an attack on search engines’ ranking algorithms that promotes spam links to top search rankings they do not deserve. Cloaking is a well-known search spam technique in which spammers serve one page to search-engine crawlers to optimize ranking, but serve a different page to browser users to maximize potential profit. In this experience report, we investigate a different and relatively new type of cloaking, called Click-Through Cloaking, in which spammers serve non-spam content to browser users who visit the URL directly without clicking through search results, in an attempt to evade detection by human spam investigators and anti-spam scanners. We survey cloaking techniques actually used in the wild and classify them into three categories: server-side, client-side, and combination. We propose a redirection-diff approach to spam detection that turns spammers’ cloaking techniques against themselves. Finally, we present eight case studies in which we used redirection-diff in IP subnet-based spam hunting to defend a major search engine against stealth spam pages that use click-through cloaking.
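The abstract does not spell out how redirection-diff works, so the following is only a minimal sketch of one plausible reading: visit the same URL once directly and once as if arriving from a search-results page, then compare where each visit ends up. The names `redirection_diff`, `final_host`, `simulated_fetch`, and `SEARCH_REFERRER`, the use of a referrer to imitate a click-through, and the choice to compare final hostnames are all assumptions made for illustration, not details taken from the report.

```python
from urllib.parse import urlsplit

# Hypothetical referrer standing in for a search-results page (assumption).
SEARCH_REFERRER = "https://search.example/results?q=cheap+tickets"


def final_host(chain):
    """Return the hostname of the last URL in a redirection chain."""
    return urlsplit(chain[-1]).hostname


def redirection_diff(url, fetch_chain, referrer=SEARCH_REFERRER):
    """Return True when the direct visit and the simulated click-through
    visit end on different hosts, which this sketch treats as suspicious."""
    direct = fetch_chain(url, None)
    clicked = fetch_chain(url, referrer)
    return final_host(direct) != final_host(clicked)


def simulated_fetch(url, referrer):
    """Stand-in for a real browser or HTTP client (an assumption of this
    sketch): a cloaked page redirects only when a referrer is present."""
    if "cloaked" in url and referrer:
        return [url, "http://ads.example/landing"]
    return [url]


print(redirection_diff("http://cloaked.example/page", simulated_fetch))
print(redirection_diff("http://benign.example/page", simulated_fetch))
```

In practice `fetch_chain` would need to be backed by a real browser to observe client-side cloaking, since a script on the page can inspect the referrer without the server ever seeing a difference; a plain HTTP client would only expose the server-side variant.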
Similar resources
Detecting Arabic Cloaking Web Pages Using Hybrid Techniques
Many challenges are emerging in the ever-expanding Internet environment, for both Internet users and Web site owners. Internet users need to retrieve high-quality information relevant to their queries within a short period of time in order to become regular users who are satisfied with search engine performance. Web site owners, meanwhile, aim in most cases to ...
Improving Cloaking Detection using Search Query Popularity and Monetizability
Cloaking is a search engine spamming technique used by some Web sites to deliver one page to a search engine for indexing while serving an entirely different page to users browsing the site. In this paper, we show that the degree of cloaking among search results depends on query properties such as popularity and monetizability. We propose estimating query popularity and monetizability by analyz...
Cloaker Catcher: A Client-based Cloaking Detection System
Cloaking has long been exploited by spammers for the purpose of increasing the exposure of their websites. In other words, cloaking has long served as a major malicious technique in search engine optimization (SEO). Cloaking hides the true nature of a website by delivering blatantly different content to users versus web crawlers. Recently, we have also witnessed a rising trend of employing cloa...
Cloaking and Redirection: A Preliminary Study
Cloaking and redirection are two possible search engine spamming techniques. In order to understand cloaking and redirection on the Web, we downloaded two sets of Web pages while mimicking a popular Web crawler and as a common Web browser. We estimate that 3% of the first data set and 9% of the second data set utilize cloaking of some kind. By checking manually a sample of the cloaking pages fr...
Detecting Cloaking Web Spam Using Hash Function
Web spam is an attempt to boost the ranking of particular pages in search engine results. Cloaking is one such spamming technique. Previous cloaking detection methods, based on term and link differences between the crawler’s and the browser’s copies of a page, are not accurate enough. The latest technique is a tag-based method, which finds cloaked pages better than previous algorithms. However, addressing the c...